k-NN Regression on Functional Data with Incomplete Observations
Authors
Abstract
In this paper we study a general version of regression in which each covariate is itself a functional datum, such as a distribution or a function. In real applications, however, we typically do not have direct access to such data; only noisy estimates of the true covariate functions or distributions are available. For example, when each covariate is a distribution, we may be unable to observe the distribution directly, but we can assume that an i.i.d. sample set drawn from it is available. We present a general framework and a k-NN based estimator for this regression problem. We prove consistency of the estimator and derive its convergence rates. We further show that the proposed estimator adapts to the local intrinsic dimension in this setting, and we provide a simple approach for choosing k. Finally, we illustrate the applicability of our framework with numerical experiments.
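The setting described in the abstract can be illustrated with a minimal sketch. This is not the paper's estimator: it assumes one-dimensional distributions, sample sets of equal size, the Wasserstein-1 distance between empirical samples as the notion of closeness, and an unweighted average of the k nearest responses. The function names `w1_distance` and `knn_predict` are hypothetical and chosen only for this illustration.

```python
import statistics


def w1_distance(xs, ys):
    """Wasserstein-1 distance between two equal-size 1-D empirical samples.

    Assumption of this sketch: for equal sample sizes the distance reduces
    to the mean absolute difference between the sorted samples.
    """
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equal sample sizes")
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)


def knn_predict(train_samples, train_y, query_sample, k):
    """Average the responses of the k training sample sets nearest the query.

    Each covariate is never observed directly; it is represented only by
    the i.i.d. sample set drawn from it.
    """
    dists = [w1_distance(s, query_sample) for s in train_samples]
    nearest = sorted(range(len(dists)), key=dists.__getitem__)[:k]
    return statistics.fmean(train_y[i] for i in nearest)


if __name__ == "__main__":
    # Three hand-made "distributions", each seen through two sample points.
    train_samples = [[0.0, 0.0], [1.0, 1.0], [10.0, 10.0]]
    train_y = [0.0, 2.0, 100.0]
    # The two closest sample sets are the first two, so this prints 1.0.
    print(knn_predict(train_samples, train_y, [0.4, 0.4], k=2))
```

Here the noise in the covariates enters through the finite sample sets: larger sample sets give better distance estimates, and k controls how many neighbouring responses are averaged.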
Similar resources
Rates of Uniform Consistency for k-NN Regression
We derive high-probability finite-sample uniform rates of consistency for k-NN regression that are optimal up to logarithmic factors under mild assumptions. We moreover show that kNN regression adapts to an unknown lower intrinsic dimension automatically. We then apply the k-NN regression rates to establish new results about estimating the level sets and global maxima of a function from noisy o...
Functional Modeling of Iranian Precipitation Based on Temperature and Humidity
Functional Data Analysis (FDA) has recently made considerable progress because of easier access to data that are essentially in the form of curves. Modeling of Iranian precipitation based on temperature and humidity, in a way that respects the essentially continuous nature of these phenomena as functions of time, has not been done properly. The corresponding data are generally collected daily or ...
The roles of nearest neighbor methods in imputing missing data in forest inventory and monitoring databases
Almost universally, forest inventory and monitoring databases are incomplete, ranging from missing data for only a few records and a few variables, common for small land areas, to missing data for many observations and many variables, common for large land areas. For a wide variety of applications, nearest neighbor (NN) imputation methods have been developed to fill in observations of variables...
Comparing K Nearest Neighbours Methods and Linear Regression - Is There Reason To Select One Over the Other?
Non-parametric k nearest neighbours (k-nn) techniques are increasingly used in forestry problems, especially in remote sensing. Parametric regression analysis has the advantage of well-known statistical theory behind it, whereas the statistical properties of k-nn are less studied. In this study, we compared the relative performance of k-nn and linear regression in an experiment. We examined the...
Prediction of the waste stabilization pond performance using linear multiple regression and multi-layer perceptron neural network: a case study of Birjand, Iran
Background: Data mining (DM) is an approach used in extracting valuable information from environmental processes. This research depicts a DM approach used in extracting some information from influent and effluent wastewater characteristic data of a waste stabilization pond (WSP) in Birjand, a city in Eastern Iran. Methods: Multiple regression (MR) and neural network (NN) models were examined u...